MEGGASENSE - The Metagenome/Genome Annotated Sequence Natural Language Search Engine: A Platform for 
the Construction of Sequence Data Warehouses.

نویسندگان

  • Ranko Gacesa
  • Jurica Zucko
  • Solveig K Petursdottir
  • Elisabet Eik Gudmundsdottir
  • Olafur H Fridjonsson
  • Janko Diminic
  • Paul F Long
  • John Cullum
  • Daslav Hranueli
  • Gudmundur O Hreggvidsson
  • Antonio Starcevic
چکیده

The MEGGASENSE platform constructs relational databases of DNA or protein sequences. The default functional analysis uses 14 106 hidden Markov model (HMM) profiles based on sequences in the KEGG database. The Solr search engine allows sophisticated queries and a BLAST search function is also incorporated. These standard capabilities were used to generate the SCATT database from the predicted proteome of Streptomyces cattleya. The implementation of a specialised metagenome database (AMYLOMICS) for bioprospecting of carbohydrate-modifying enzymes is described. In addition to standard assembly of reads, a novel 'functional' assembly was developed, in which screening of reads with the HMM profiles occurs before the assembly. The AMYLOMICS database incorporates additional HMM profiles for carbohydrate-modifying enzymes and it is illustrated how the combination of HMM and BLAST analyses helps identify interesting genes. A variety of different proteome and metagenome databases have been generated by MEGGASENSE.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

pfsearchV3: a code acceleration and heuristic to search PROSITE profiles

SUMMARY The PROSITE resource provides a rich and well annotated source of signatures in the form of generalized profiles that allow protein domain detection and functional annotation. One of the major limiting factors in the application of PROSITE in genome and metagenome annotation pipelines is the time required to search protein sequence databases for putative matches. We describe an improved...

متن کامل

Single Nucleotide Polymorphisms and Association Studies: A Few Critical Points

Uncovering DNA sequence variations that correlate with phenotypic changes, e.g., diseases, is the aim of sequence variation studies. Common types sequence variations are Single nucleotide polymorphism (SNP, pronounced snip).SNPs are the third-generation molecular marker. SNP represents a DNA sequence variant of a single base pair with the minor allele occurring in more than 1% of a given popula...

متن کامل

ارزیابی خودکار جویش‌گرهای ویدئویی حوزه وب فارسی بر اساس تجمیع آرا

Today, the growth of the internet and its high influence in individuals’ life have caused many users to solve their daily needs by search engines and hence, the search engines need to be modified and continuously improved. Therefore, evaluating search engines to determine their performance is of paramount importance. In Iran, as well as other countries, extensive researches are being performed ...

متن کامل

Comparative bioinformatics analysis of a wild diploid Gossypium with two cultivated allotetraploid species

Background: Gossypium thurberi is a wild diploid species that has been used to improve cultivated allotetraploid cotton. G. thurberi belongs to D genome, which is an important wild bio-source for the cotton breeding and genetic research. To a certain degree, chloroplast DNA sequence information are a versatile tool for species identification and phylogenetic implications in plants. Different ch...

متن کامل

GENETIC AND TABU SEARCH ALGORITHMS FOR THE SINGLE MACHINE SCHEDULING PROBLEM WITH SEQUENCE-DEPENDENT SET-UP TIMES AND DETERIORATING JOBS

 This paper introduces the effects of job deterioration and sequence dependent set- up time in a single machine scheduling problem. The considered optimization criterion is the minimization of the makespan (Cmax). For this purpose, after formulating the mathematical model, genetic and tabu search algorithms were developed for the problem. Since population diversity is a very important issue in ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Food technology and biotechnology

دوره 55 2  شماره 

صفحات  -

تاریخ انتشار 2017